Scaling up cosine interesting pattern discovery: A depth-first method
Abstract
This paper presents an efficient algorithm called CosMiner for interesting pattern discovery. The widely used cosine similarity, found to possess the null-invariance property and the anti-cross-support-pattern property, is adopted as the interestingness measure in CosMiner. CosMiner is essentially an FP-growth-like depth-first traversal algorithm that rests on an important property of the cosine similarity: the conditional anti-monotone property (CAMP). The combined use of CAMP and the depth-first support-ascending traversal strategy enables the pre-pruning of uninteresting patterns during the mining process of CosMiner. Extensive experiments demonstrate the high efficiency of CosMiner in interesting pattern discovery, in comparison to the breadth-first strategy and the post-evaluation strategy. In particular, CosMiner shows its capability in suppressing the generation of cross-support patterns and discovering rare but truly interesting patterns. Finally, an interesting case of landmark recognition is presented to illustrate the value of cosine interesting patterns found by CosMiner in real-world applications. © 2014 Elsevier Inc. All rights reserved.
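The properties named in the abstract can be illustrated with a small sketch. It assumes the common multi-item generalization of cosine, the joint support count of a pattern divided by the geometric mean of its items' individual support counts; the abstract does not state the formula, so this definition, the function names, and the toy transactions are all assumptions made for illustration. Under that definition, padding the data with transactions that contain none of the pattern's items leaves the score unchanged (null-invariance), and extending a pattern with an item whose support is at least as large as that of every item already in it cannot raise the score, which is the kind of conditional anti-monotonicity a support-ascending traversal could exploit for pruning.

```python
from math import prod


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def cosine(itemset, transactions):
    """Assumed multi-item cosine: joint count / geometric mean of single-item counts."""
    items = list(itemset)
    joint = support_count(items, transactions)
    if joint == 0:
        return 0.0
    singles = [support_count([i], transactions) for i in items]
    return joint / prod(singles) ** (1.0 / len(items))


# Hypothetical toy data: supports are a=3, b=4, c=5, d=1.
transactions = [
    {"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"b", "c"},
    {"c"}, {"c"}, {"c", "d"},
]

# Null-invariance: adding transactions without a or b does not change the score.
padded = transactions + [set() for _ in range(100)]
print(cosine(["a", "b"], transactions) == cosine(["a", "b"], padded))

# Conditional anti-monotonicity: c has support >= that of a and b,
# so the extended pattern cannot score higher than {a, b}.
print(cosine(["a", "b", "c"], transactions) <= cosine(["a", "b"], transactions))
```

Working with raw counts rather than relative supports makes the null-invariance visible directly: the total number of transactions never enters the computation.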
Similar resources
The Coordinate Rotation Digital Computer (CORDIC) algorithm is an established method for computing complex arithmetic functions using shift and add operations. An absolute scaling-free CORDIC algorithm for cosine and sine function computation has been implemented. A combination of third order approximation Taylor series and leading-one-bit detection algorithm has been adopted in this impleme...
A novel noise filter based on interesting pattern mining for bag-of-features images
Improving the quality of image data through noise filtering has long attracted attention. To date, many studies have been devoted to filtering the noise inside the image, while few of them focus on filtering the instance-level noise among normal images. In this paper, aiming at providing a noise filter for bag-of-features images, (1) we first propose to utilize the cosine interesting ...
Scaling up all pairs similarity search
Given a large collection of sparse vector data in a high dimensional space, we investigate the problem of finding all pairs of vectors whose similarity... Published by ACM. The problem of finding a...
Flexible Clustering by Tendency in High Dimensional Space
Clustering is the process of grouping a set of objects into classes of similar objects. Until recently, the concept of similarity was based on distances, e.g., Euclidean distance and cosine distance. Our previous work on δ-cluster and δ-pCluster designed new similarity models to capture subspace coherency exhibited in data and focused on shifting patterns or scaling patterns. Along the same genera...
Transformation-Based Learning meets Frequent Pattern Discovery
Transformation-based learning (TBL) and frequent pattern discovery (FPD) are two popular research paradigms, one from the domain of empirical natural language processing, the second from the field of data mining. This paper describes how Eric Brill's original TBL algorithm can be improved via incorporation of FPD techniques. The algorithm B-Warmr is presented that upgrades TBL to first-order logic...
Journal: Inf. Sci.
Volume: 266
Publication date: 2014